ABSTRACT


Mosquito-borne diseases such as dengue fever are a pervasive public health
problem around the world, and further investigation is needed to correct
misunderstandings of the disease among communities. This calls for
personalized information delivery that targets each person’s misunderstanding.
Personalizing information requires several major steps:
(i) determine the attributes that will be used to characterize a person,
(ii) select an algorithm that accurately and efficiently classifies the person
according to the retrieved background information, and (iii) recommend
the correct information to rectify the particular misunderstanding.
This research paper considers the first two steps. First, attributes describing
knowledge, attitudes and prevention practices are drawn from
the established literature, where some variables have a significant impact on
the predictive model. Second, five machine learning
algorithms were tested on the classification task. The results indicate that
the support vector machine and decision tree algorithms provide the best
performance in classifying a person’s understanding of dengue fever.

1. INTRODUCTION

The rate of mosquito-borne diseases (MBDs) outbreaks tends to increase whenever the seasons
change, owing to factors such as socioeconomic and environmental changes, socio-geographical conditions,
imported cases and human movement patterns [1, 2]. All of these elements can sustain mosquito propagation
and make MBDs harder to control, since predicting future consequences in a changing
environment demands a greater understanding of the array of biological and environmental features [3].
Current MBDs outbreak prediction systems were developed based on environmental factors such as floods,
building types, daily mean temperature, daily rainfall and vegetation index [4-7]. House Index and Breteau
Index features were also often included in the predictive models to denote risk areas [8]. Most studies using
data mining classification algorithms have shown that climate change has a significant impact on
the increasing risk of dengue cases [9, 10]. Obtaining such data requires constant observation around the hotspot
area, yet environmental monitoring alone still cannot reduce dengue virus infection [1].

To improve the prediction framework, morbidity rate forecasting works better when vectors
such as female Aedes aegypti and Aedes albopictus mosquitoes and the larvae index are selected as parameters, as
reported in [11]. This motivates including attributes based on human behavior
regarding MBDs, as in [12-18], in a predictive model that classifies a person’s misconception in
a personalized e-learning system. This research proposes that human knowledge, attitudes and preventive
practices be included in a machine learning approach in order to identify misconceptions and provide a simple MBDs
awareness program in a personalized e-learning environment. The proposed predictive framework for
the personalized e-learning environment is illustrated in Figure 1.


Figure 1. A predictive system to classify the user’s misconception of MBDs
in a personalized e-learning environment

To curb MBDs, particularly in hotspot areas, it is important to ensure that members of the public
have adequate awareness through effective educational intervention. Social media such as Facebook
and Twitter were the preferred sources of dengue information among the public [19]. Health stakeholders
in many countries often tweet and create special postings to announce dengue outbreaks in
certain areas. These include advice on how to control the epidemic and early information about disease
detection. However, accessing such information might be hard for people who live in
rural and remote areas. Relevant MBDs-based information can be recommended to mold positive
attitudes and teach the best preventive practices to children. For instance, personal attributes and
environmental factors such as region, specific location features and climate contexts were used in [20]
to predict a person’s risk of contracting infectious diseases. In [21], an application lets
the user submit information related to mosquito prevalence (e.g. felt, seen, bitten, heard mosquitoes).
The public can also immediately report the presence of Aedes, the vector mosquito, at their current location
and report dengue cases in the household or surrounding communities. This open application requires active
citizen engagement and is combined with other data, such as ovitrap egg counts and micro-climate data, to
support stakeholders in proactive disease prevention, control and education. However, the application
might suffer from underreporting when people in certain places do not participate.


2. RESEARCH METHOD 

Dengue knowledge, attitudes, and practices surveys have been conducted frequently to describe
communities’ prevention behavior towards the disease. Knowledge of dengue and Aedes is expected to be
high among the public because good practices and risk perception regarding dengue have been
promoted through mainstream media and in education. An online community-based survey was carried out
from 1st September 2018 until 10th December 2018 with 640 respondents from the Malaysian public aged
7 years old and above. The questionnaire was posted and shared in several Malaysian science-based Facebook
pages and other groups.

The self-administered structured questionnaire covered demographic profiles, including
gender, age, marital status, education level, and housing type. Concerning awareness of the dengue issue,
the participants answered general knowledge questions about mosquitoes, such as the mosquito’s breeding sites,
prevention methods and active biting time. Concerning awareness of dengue
presence and prevention practices, the participants were asked to provide information regarding the fogging
frequency in their living area. The fogging activity, in this context, was interpreted as an indication of
whether their location might be a hotspot area. This was followed by a section of thirteen True/False questions
relevant to cleanliness, trash management, water reservoir management and mosquito repellent usage. The logical flow
of the questionnaire culminated in the actions taken by the study participants in relation to their general
knowledge of mosquito behavior.

An online survey is a fast way to collect data from respondents with varied backgrounds while
keeping their anonymity. The items on the questionnaire were adapted from previously validated instruments
with some modifications to meet the research objective. Closed-ended questions were used to keep
the analysis simple. Respondents only needed to choose the best answer according to their
existing knowledge.

Both descriptive analysis and machine learning were used in the experiment. Descriptive analysis
was used to summarize socio-demographic characteristics such as age, gender, marital status, ethnicity, highest
education and type of residence. The machine learning algorithms were tested on 18 attributes of individual
knowledge, attitudes and practices, such as cleanliness; waste management; handling of clogged drains and
stagnant water; aerosol spray or mosquito repellent usage; clothing type and color; and misconceptions
about the mosquito’s breeding sites, the mosquito’s active biting time and dengue transmission. Table 1 describes
the selected features that were found relevant to the proposed framework.


Table 1. Selected features mapped to the personalized e-learning environment

No.  Features
1.   Aware of the presence of Aedes mosquitoes within the living area
     Importance: To promote mosquito control practices through educational programs, where strategies
     related to environmental management can be developed for mosquito control when mosquitoes appear.
2.   Frequency of fogging in the housing area
     Importance: To identify the availability of stagnant water that is not disposed of properly or
     taken care of by the communities as a potential breeding site.
3.   Cleaning the area around the house
4.   Dumping trash in the right place
5.   Dumping trash that can hold water in the right place
6.   Frequency of cleaning the drains or clogged waterways
     Importance (features 3-6): To identify cleaning practices, since people live in different areas
     and might have different attitudes and cleaning habits.
7.   Closing or discarding the water reservoir before a long vacation
8.   Checking for larvae in the water reservoir
9.   Putting Abate/chemicals in water reservoirs to avoid larvae
     Importance (features 7-9): To identify the level of preventive practices, since different
     demographics might manage water differently in their living space.
10.  Wearing long-sleeved shirts and long pants to prevent mosquito bites
11.  Using mosquito repellents such as liquids, coils, or traps in the living area
12.  Using an aerosol spray to disperse mosquitoes in dark areas
13.  Wearing bright colored clothes to avoid mosquito bites
14.  Sleeping in a mosquito net
15.  Applying mosquito repellent cream on the body
     Importance (features 10-15): To identify the level of self-protection practices as the smallest
     effort that can be made to avoid mosquito bites.
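Before a classifier can use the survey answers behind Table 1, each respondent has to be turned into a fixed-length feature vector. The following is a minimal Python sketch of one possible encoding, not the preprocessing used in the study. The column names, the 0/1 coding of Yes/No answers and the ordinal coding of fogging frequency are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical column names for the Yes/No practice items;
# the survey's real variable names are not given in the paper.
YES_NO_ITEMS: List[str] = [
    "aware_of_aedes",
    "cleans_house_area",
    "dumps_trash_properly",
    "dumps_water_holding_trash_properly",
    "cleans_drains",
    "closes_reservoir_before_vacation",
    "checks_reservoir_for_larvae",
    "uses_abate",
    "wears_long_sleeves",
    "uses_repellent_devices",
    "uses_aerosol_spray",
    "wears_bright_clothes",
    "sleeps_under_net",
    "applies_repellent_cream",
]

# Assumed ordinal coding of the four fogging-frequency answers.
FOGGING_LEVELS: Dict[str, int] = {"None": 0, "Seldom": 1, "Sometimes": 2, "Always": 3}


def encode_respondent(answers: Dict[str, str]) -> List[int]:
    """Turn one respondent's answers into a numeric feature vector."""
    vector = [FOGGING_LEVELS[answers["fogging_frequency"]]]
    for item in YES_NO_ITEMS:
        vector.append(1 if answers[item] == "Yes" else 0)
    return vector


# Usage: a made-up respondent who answers "No" to everything except the net.
example = {name: "No" for name in YES_NO_ITEMS}
example["fogging_frequency"] = "Seldom"
example["sleeps_under_net"] = "Yes"
example_vector = encode_respondent(example)
```

A tree-based learner could equally treat every answer as a nominal category, in which case the ordinal coding of fogging frequency would be unnecessary.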


The machine learning algorithms were trained and tested using WEKA,
a data mining application developed at the University of Waikato in New Zealand. Different kinds of
predictive models were built and their classification performance was evaluated.
The algorithms used in model building are support vector machine (SVM), decision tree (DT), logistic
regression (LR), naive Bayes (NB) and artificial neural network (ANN). Tenfold cross-validation
was used to obtain the accuracy of each algorithm in classification. A similar comparison of classification
algorithms was carried out to predict chronic kidney disease (CKD), where naive Bayes
achieved the highest accuracy [22].
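The tenfold cross-validation procedure can be sketched as follows. This is a minimal Python illustration of how the 640 respondents could be split into ten train/test partitions, not the WEKA implementation used in the study. The function name `k_fold_indices` and the fixed seed are assumptions, and this plain shuffle does not stratify the folds by class.

```python
import random
from typing import List, Tuple


def k_fold_indices(n_samples: int, k: int = 10, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Split sample indices into k (train, test) pairs for cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        # Every fold except the i-th one forms the training set.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


# Usage: ten splits over the 640 survey respondents.
splits = k_fold_indices(640, k=10)
```

Each respondent appears in exactly one test fold, so averaging the per-fold scores gives a performance estimate that uses every record for testing once.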


3. RESULTS AND DISCUSSION 

A total of 640 respondents completed the online survey. Regarding gender,
the proportion of female respondents (80%) was higher than that of male respondents (20%)
(Table 2). Of the respondents, 76.4% were aged 18 to 40 years, 12.5% were aged 7 to 17 years and 11.1% were aged
more than 40 years, and more than 60% were married. The largest group of respondents
came from the Malay ethnic group (93.1%), followed by others (Chinese, Indian, Dusun, Kadazan,
Melanau, Bidayuh, Brunei, and Iban). Most of the respondents had attained tertiary education (73.5%),
followed by secondary school (26%) and primary school (0.5%). This indicates that most respondents are
well educated and should have at least basic knowledge of MBDs, especially dengue fever. The largest
share of respondents lived in a terrace or twin house (43.7%).

Amongst the survey participants, 91.9% declared that they were aware of the presence
of Aedes mosquitoes in their area, as shown in Table 3. When asked about the frequency of fogging in their
residential area, more than 10% of respondents reported that it always happens in their area. Over 50%
of respondents reported that their house has a medium to large number of plants, and nearly 70% of them live
outside urban areas.

Table 2. Demographic information about participants surveyed

Characteristics               N      %
Sex
  Female                      512    80
  Male                        128    20
Age (range)
  7-17                        80     12.5
  18-40                       489    76.4
  41+                         71     11.1
Marital status
  Married                     409    63.9
  Single                      231    36.1
Ethnicity
  Malay                       591    93.1
  Others                      49     6.9
Highest education
  Primary school              3      0.5
  Secondary school            166    26
  Tertiary education          470    73.5
Types of residence
  Flat/Apartment/Condo        168    26.3
  Terrace/Twin House          280    43.7
  Bungalow/Village house      192    30


The majority of people surveyed had never experienced dengue fever; however, nearly 20% of them
had contracted it before. It was found that 18% of respondents did not clean the area around their house, which
might be correlated with the rise of dengue fever. Some prevention
practices were very common among the respondents: 98.9% of them throw trash in the right place.

More than 80% of respondents dispose of rubbish that can hold water in
the right place, ensure the water reservoir is closed or discarded before a long vacation, and always check
the water reservoir for larvae. Other practices varied: nearly
30% of respondents did not clean the drains or clogged waterways; 28% did not use an aerosol
spray to disperse mosquitoes in dark areas; nearly 40% did not use mosquito repellents such as
liquids, coils, or traps in their living area; and only 25% put Abate or chemicals in water reservoirs
to avoid mosquito larvae. Furthermore, about half of the respondents wear long-sleeved shirts and long pants to
prevent mosquito bites, and a similar share wear bright-colored clothing for the same purpose. Only 3.9% of
respondents used a mosquito net for sleeping and 11.4% applied mosquito repellent cream to their body.

Although 91.9% of respondents declared that they were aware of the presence of Aedes mosquitoes,
67% of respondents could not answer all three knowledge questions correctly, as illustrated in Table 4.
Thus, they are classified as having a misconception of the MBDs issue. A previous study, in which the majority
of respondents had only finished secondary school, found that only 54.6% of them had a high level of knowledge
regarding dengue infections [15]. The present result indicates that misconceptions are very
common among the public even though the respondents are highly educated. Respondents might think they have sufficient
knowledge of dengue from the information they obtained from television, newspapers, radio,
social media, and awareness campaigns, but that information might not have been fully comprehended or
delivered, as more than half of them were unable to answer the questions correctly. Therefore, it is very
important for education stakeholders to include the prediction system in their e-learning systems to identify
the target users and increase their MBDs awareness in a personalized manner.
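The labelling rule described above, where a respondent is flagged unless all three knowledge questions are answered correctly, can be written as a short Python sketch. The function name `has_misconception` is an assumption for illustration, and the answer key for each question is not reproduced here because the paper does not state it. The counts are the ones reported in Table 4.

```python
from typing import Sequence


def has_misconception(correct: Sequence[bool]) -> bool:
    """Return True unless all three knowledge questions were answered correctly."""
    if len(correct) != 3:
        raise ValueError("expected results for exactly three questions")
    return not all(correct)


# Class shares recomputed from the counts reported in Table 4.
n_correct, n_incorrect = 211, 429
total = n_correct + n_incorrect
share_correct = round(100 * n_correct / total)
share_incorrect = round(100 * n_incorrect / total)
```

Under this rule a single wrong answer is enough to place a respondent in the misconception class, which is why that class is about twice as large as the other.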

In order to correctly identify a person’s understanding of the MBDs issue, the performance
of the selected machine learning algorithms was evaluated. The algorithms were trained
and tested to classify whether a person needs further intervention based on the 18 features listed in
Tables 3 and 4. The true-positive (TP) rate, false-positive (FP) rate, precision, recall, F-measure, and time
taken to build the classifier are presented in Table 5 for each algorithm. SVM and DT
are the most precise algorithms for classifying a person’s misunderstanding, each with a precision of 0.994
and a false-positive rate of 0.008. However, DT took less time than SVM (0.05 s versus 0.16 s), which may be
because the input was categorical rather than continuous. Surprisingly, ANN performed slightly worse than
the logistic regression model. Because the model-building process is less complicated for logistic regression,
the extra complexity of an artificial neural network may not be worthwhile for this task. An ANN can be seen as
a nonlinear generalization of logistic regression and is therefore at least as powerful in principle, but in
this case using neural networks was not significantly better.
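For readers unfamiliar with the measures reported in Table 5, the sketch below shows how they follow from a binary confusion matrix. It is a minimal Python illustration under stated assumptions: the counts are made up and are not the study's results, the names `BinaryCounts` and `classification_metrics` are hypothetical, and WEKA may report class-weighted averages rather than the single-class values computed here.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class BinaryCounts:
    tp: int  # misconception predicted, misconception present
    fp: int  # misconception predicted, but not present
    tn: int  # no misconception predicted, none present
    fn: int  # no misconception predicted, but present


def classification_metrics(c: BinaryCounts) -> Dict[str, float]:
    """Compute TP rate, FP rate, precision, recall and F-measure."""
    tp_rate = c.tp / (c.tp + c.fn)      # identical to recall
    fp_rate = c.fp / (c.fp + c.tn)
    precision = c.tp / (c.tp + c.fp)
    f_measure = 2 * precision * tp_rate / (precision + tp_rate)
    return {
        "tp_rate": tp_rate,
        "fp_rate": fp_rate,
        "precision": precision,
        "recall": tp_rate,
        "f_measure": f_measure,
    }


# Usage with illustrative counts only (not taken from the paper).
demo = classification_metrics(BinaryCounts(tp=90, fp=5, tn=95, fn=10))
```

The TP rate and recall are the same quantity, which is consistent with those two columns being identical for every algorithm in Table 5.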

Based on the results, the DT algorithm is found suitable for use in the predictive model to support
the objective of the research. Dengue fever classification using the naive Bayes algorithm has
been conducted in [23] to identify positive or negative results based on several attributes such as fever,
bleeding, flu, myalgia, and other symptoms, with a precision of 0.92. In [24], the DT algorithm was
used to predict patients with severe dengue that may warrant admission based on several variables
(e.g. vomiting, pleural effusion, and systolic blood pressure), with a precision of up to 0.81. It has also
been used in [25] to propose a patient-monitoring plan and outpatient management of fever in resource-poor
settings, using clinical features and laboratory indicators to identify the severity of dengue cases.
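To illustrate why a decision tree fits categorical survey answers well, the sketch below computes information gain, the split criterion used by ID3-style trees, for a single Yes/No feature. This is a generic Python illustration, not the tree learner used in the study, and the WEKA implementation may use a related but different criterion. The toy answers and labels are made up.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values: Sequence[Hashable], labels: Sequence[Hashable]) -> float:
    """Reduction in label entropy obtained by splitting on one categorical feature."""
    total = len(labels)
    groups: Dict[Hashable, List[Hashable]] = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy of the label subsets created by the split.
    remainder = sum(len(group) / total * entropy(group) for group in groups.values())
    return entropy(labels) - remainder


# Usage with made-up data: one informative and one uninformative feature.
labels = ["ok", "ok", "misconception", "misconception"]
gain_informative = information_gain(["Yes", "Yes", "No", "No"], labels)
gain_uninformative = information_gain(["Yes", "No", "Yes", "No"], labels)
```

A tree builder would pick the feature with the highest gain at each node, so Yes/No and multiple-choice answers can be used directly without any numeric scaling.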

A study suggests that increased knowledge through social media can positively influence attitudes
and practices regarding dengue prevention measures [26]. However, social media alone cannot be personalized,
because human attitudes regarding MBDs are not taken into consideration. This study
proposes that misconceptions of the disease be identified among the public in the form of a quiz taken when
using the e-learning system. Once the system has obtained enough information regarding the user’s awareness
background, it will analyze it and personalize the relevant MBDs information between
learning sessions. The information can also be fed directly to target users, such as children.
This can be done based on a few rules, for example targeting children who start using the Internet at a very
young age or children who live in rural areas. Feeding such information to non-target users would
waste resources. Further investigation is needed to identify differences in culture, socioeconomic
background, and location, which may also contribute to preventive practices among children.
Thus, this paper highlights a significant opportunity for health authorities and education stakeholders to
promote such campaigns through e-learning systems or any available open learning platform.
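The rule-based targeting mentioned above could look like the following Python sketch. Everything in it is hypothetical: the paper does not define the rules, so the `User` fields, the age cut-off (chosen to match the 7-17 age band in Table 2) and the way the conditions are combined are assumptions for illustration only.

```python
from dataclasses import dataclass

# Assumed cut-off for "child", matching the 7-17 age band in Table 2.
CHILD_MAX_AGE = 17


@dataclass
class User:
    age: int
    lives_in_rural_area: bool
    has_misconception: bool  # output of the classifier


def should_receive_content(user: User) -> bool:
    """Hypothetical rule: target children or rural users who show a misconception."""
    is_target_group = user.age <= CHILD_MAX_AGE or user.lives_in_rural_area
    return user.has_misconception and is_target_group


# Usage with made-up users.
child = User(age=9, lives_in_rural_area=False, has_misconception=True)
urban_adult = User(age=30, lives_in_rural_area=False, has_misconception=True)
decisions = (should_receive_content(child), should_receive_content(urban_adult))
```

Keeping the rules separate from the classifier means stakeholders could change who counts as a target user without retraining the model.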


Table 3. Awareness of dengue presence and prevention practices

No.  Features     N     %
1.   I am aware of the presence of Aedes mosquitoes in my area
       Yes        588   91.9
       No         52    8.1
2.   What is the frequency of fogging in your housing area?
       Always     65    10.2
       Sometimes  173   27
       Seldom     280   43.8
       None       122   19.1
3.   I often clean the area around the house
       Yes        525   82
       No         115   18
4.   I dump trash in the right place
       Yes        633   98.9
       No         7     1.1
5.   I dump trash that can hold water in the right place
       Yes        559   87.3
       No         81    12.7
6.   I often clean the drains or clogged waterways
       Yes        460   71.9
       No         180   28.1
7.   I make sure the water reservoir is closed or discarded before a long vacation
       Yes        552   86.3
       No         88    13.7
8.   I always use an aerosol spray to disperse mosquitoes in the dark area
       Yes        461   72
       No         179   28
9.   I always check the water reservoir so there are no larvae
       Yes        547   85.5
       No         93    14.5
10.  I use mosquito repellents such as liquids, coils, or trap in my living area
       Yes        395   61.7
       No         245   38.3
11.  I put Abate/chemicals in water reservoirs to avoid larvae
       Yes        161   25.2
       No         479   74.8
12.  I wear long-sleeved shirts and long pants to prevent mosquito bites
       Yes        324   50.6
       No         316   49.4
13.  I wear bright colored clothes to avoid mosquito bites
       Yes        310   48.4
       No         330   51.6
14.  I sleep in the mosquito net
       Yes        25    3.9
       No         615   96.1
15.  I always apply mosquito repellent cream on my body
       Yes        73    11.4
       No         567   88.6

Table 4. Knowledge of Aedes characteristics

No.  Questions                                      N     %
1.   I know Aedes mosquitoes breed in the following areas:
       Stagnant water
       Septic tank
       Drain with waste
       Container with clear water
2.   I know Aedes mosquitoes actively bite at the following times:
       Dawn
       Afternoon
       Evening
       Night
3.   How is dengue transmitted?
       Mosquito bites
       Contaminated and exposed foods
       Don’t know
     Correctly answered all questions               211   33
     Incorrectly answered some/all questions        429   67


Table 5. Performance comparison of the classification algorithms

Algorithm  TP rate  FP rate  Precision  Recall  F-measure  Time taken (s)
NB         0.991    0.009    0.991      0.991   0.991      0.01
LR         0.991    0.012    0.991      0.991   0.991      0.24
ANN        0.986    0.012    0.986      0.986   0.986      6.32
SVM        0.994    0.008    0.994      0.994   0.994      0.16
DT         0.994    0.008    0.994      0.994   0.994      0.05


4. CONCLUSION 

Based on the awareness survey, the majority of respondents applied good practices in dengue
prevention. However, an effective prediction model to detect MBDs misconceptions is still needed to
identify communities’ awareness and enhance their knowledge by recommending information about
dengue prevention and early symptom recognition. A predictive system should be embedded in the e-learning
environment to analyze the user’s behavior and provide personalized information. The features selected for
the study were found relevant for the proposed framework. This research proposes the decision tree algorithm
for constructing the predictive model of the online dengue awareness program, taking into account
human behavior and current knowledge regarding MBDs. The research focuses on Aedes
and dengue transmission due to the high number of dengue fever and dengue hemorrhagic fever cases in Malaysia
compared to other MBDs. Considerably more work is needed to identify the features and causes of other MBDs such
as malaria, Zika, and chikungunya, including geospatial analysis and environmental factors.